{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "a0c79d53",
   "metadata": {},
   "source": [
    "# Create and load knowledge graph backups"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "709ec115",
   "metadata": {},
   "source": [
    "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
    "<div class=\"toc\"><ul class=\"toc-item\">\n",
    "    <li><span><a href=\"#Setup\" data-toc-modified-id=\"Setup-1\">Setup</a></span></li>\n",
    "        <ul class=\"toc-item\">\n",
    "            <li><span><a href=\"#Import-necessary-libraries\" data-toc-modified-id=\"Import-necessary-libraries-1.1\">Import necessary libraries</a></span></li>\n",
    "            <li><span><a href=\"#Define-folder-and-file-names\" data-toc-modified-id=\"Define-folder-and-file-names-1.2\">Define flder and file names</a></span></li>\n",
    "        </ul>\n",
    "    <li><span><a href=\"#Create-backup-files\" data-toc-modified-id=\"Create-backup-files-2\">Create backup files</a></span></li>\n",
    "        <ul class=\"toc-item\">\n",
    "            <li><span><a href=\"#Connect-to-portal-and-knowledge-graph\" data-toc-modified-id=\"Connect-to-portal-and-knowledge-graph-2.1\">Connect to portal and knowledge graph</a></span></li>\n",
    "            <li><span><a href=\"#Write-data-model-entity-types-to-backup-json-file\" data-toc-modified-id=\"Write-data-model-entity-types-to-backup-json-file-2.2\">Write data model entity types to backup json file</a></span></li>\n",
    "            <li><span><a href=\"#Write-data-model-relationship-types-to-backup-json-file\" data-toc-modified-id=\"Write-data-model-relationship-types-to-backup-json-file-2.3\">Write data model relationship types to backup json file</a></span></li>\n",
    "            <li><span><a href=\"#Write-entities-to-backup-json-file\" data-toc-modified-id=\"Write-entities-to-backup-json-file-2.4\">Write entities to backup json file</a></span></li>\n",
    "            <li><span><a href=\"#Write-relationships-to-backup-json-file\" data-toc-modified-id=\"Write-relationships-to-backup-json-file-2.5\">Write relationships to backup json file</a></span></li>\n",
    "            <li><span><a href=\"#OPTIONAL:-Write-provenance-records-to-backup-json-file\" data-toc-modified-id=\"OPTIONAL:-Write-provenance-records-to-backup-json-file-2.6\">OPTIONAL: Write provenance records to backup json file</a></span></li>\n",
    "        </ul>\n",
    "    <li><span><a href=\"#Load-backup-files\" data-toc-modified-id=\"Load-backup-files-3\">Load backup files</a></span></li>\n",
    "        <ul class=\"toc-item\">\n",
    "            <li><span><a href=\"#Connect-to-portal-and-create-knowledge-graph\" data-toc-modified-id=\"Connect-to-portal-and-create-knowledge-graph-3.1\">Connect to portal and create knowledge graph</a></span></li>\n",
    "            <li><span><a href=\"#Populate-data-model-from-saved-json-files\" data-toc-modified-id=\"Populate-data-model-from-saved-json-files-3.2\">Populate data model from saved json files</a></span></li>\n",
    "            <li><span><a href=\"#Get-original-document-names-to-correctly-load-data\" data-toc-modified-id=\"Get-original-document-names-to-correctly-load-data-3.3\">Get original document names to correctly load data</a></span></li>\n",
    "            <li><span><a href=\"#Add-additional-document-entity-and-relationship-type-properties\" data-toc-modified-id=\"Add-additional-document-entity-and-relationship-type-properties-3.4\">Add additional document entity and relationship type properties</a></span></li>\n",
    "            <li><span><a href=\"#Get-list-of-date-properties-from-the-data-model-json-files\" data-toc-modified-id=\"Get-list-of-date-properties-from-the-data-model-json-files-3.5\">Get list of date properties from the data model json files</a></span></li>\n",
    "            <li><span><a href=\"#Add-all-entities-to-the-knowledge-graph\" data-toc-modified-id=\"Add-all-entities-to-the-knowledge-graph-3.6\">Add all entities to the knowledge graph</a></span></li>\n",
    "            <li><span><a href=\"#Add-all-relationships-to-the-knowledge-graph\" data-toc-modified-id=\"Add-all-relationships-to-the-knowledge-graph-3.7\">Add all relationships to the knowledge graph</a></span></li>\n",
    "            <li><span><a href=\"#Add-search-indexes-to-all-text-properties\" data-toc-modified-id=\"Add-search-indexes-to-all-text-properties-3.8\">Add search indexes to all text properties</a></span></li>\n",
    "            <li><span><a href=\"#OPTIONAL:-Add-provenance-records-to-the-knowledge-graph\" data-toc-modified-id=\"OPTIONAL:-Add-provenance-records-to-the-knowledge-graph-3.9\">OPTIONAL: Add provenance records to the knowledge graph</a></span></li>\n",
    "        </ul>\n",
    "</ul></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6c9288de",
   "metadata": {},
   "source": [
    "_**Never lose your graphs again with backups!**_\n",
    "\n",
    "Create a backup of an ArcGIS Knowledge graph you have created to store your data model and data. The backup will be in the format of json files that can read to recreate your graph or be shared for others to be able to create the same graph on a different server.\n",
    "\n",
    "_**Load a new graph easily from your backup files!**_\n",
    "\n",
    "Use the backup files with the load backup files portion of this notebook to create a graph with the same data model and data."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b373d48b",
   "metadata": {},
   "source": [
    "## Setup"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d20f5e6c",
   "metadata": {},
   "source": [
    "### Import necessary libraries\n",
    "Start by importing the libraries we need for connecting to portal, accessing knowledge graph, and manipulating data as needed"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dd42438c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# imports\n",
    "import os, json, requests\n",
    "from datetime import datetime\n",
    "from uuid import UUID\n",
    "\n",
    "from arcgis.gis import GIS\n",
    "from arcgis.graph import KnowledgeGraph"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b14f1c6e",
   "metadata": {},
   "source": [
    "### Define folder and file names\n",
    "Define all names for files so we can stay consistent for backing up and loading files"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9dea238",
   "metadata": {},
   "outputs": [],
   "source": [
    "# output folder name\n",
    "output_folder = r\"C:\\backups\\myknowledgegraph_backup\"\n",
    "\n",
    "# output backup json file names\n",
    "dm_ent = \"datamodel_entities.json\"\n",
    "dm_rel = \"datamodel_relationships.json\"\n",
    "dm_prov = \"datamodel_provenance.json\"\n",
    "all_ent = \"all_entities.json\"\n",
    "all_rel = \"all_relationships.json\"\n",
    "prov_file = \"provenance_entities.json\"  # this will only be used if you want to backup provenance records\n",
    "servicedef = \"service_definition.json\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8efc688e",
   "metadata": {},
   "source": [
    "## Create backup files"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b58d0f34",
   "metadata": {},
   "source": [
    "### Connect to portal and knowledge graph\n",
    "Connect to the portal and knowledge graph on that portal, also ensure knowledge graph service exists (url is correct)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a5c95926",
   "metadata": {},
   "outputs": [],
   "source": [
    "gis_backup = GIS(\"home\")  # connect to portal\n",
    "# connect to knowledge graph service\n",
    "knowledgegraph_backup = KnowledgeGraph(\n",
    "    \"https://myportal.com/server/rest/services/Hosted/myknowledgegraph/KnowledgeGraphServer\",\n",
    "    gis=gis_backup,\n",
    ")\n",
    "try:\n",
    "    knowledgegraph_backup.datamodel\n",
    "except:\n",
    "    raise Exception(\"Knowledge graph to backup does not exist\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17968c25",
   "metadata": {},
   "source": [
    "Get the service definition for the knowledge graph service and save it to the service definition file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8c5275cb",
   "metadata": {},
   "outputs": [],
   "source": [
    "# create a token for request\n",
    "token_url = f\"https://myportal.com/portal/sharing/rest/generateToken\"\n",
    "creds = {\n",
    "    \"username\": \"myUsername\",\n",
    "    \"password\": \"myPassword\",\n",
    "    \"referer\": \"https://myportal.com/portal\",\n",
    "    \"f\": \"json\",\n",
    "}\n",
    "token_response = requests.post(token_url, data=creds, verify=False)\n",
    "sd_data = requests.get(\n",
    "    url=\"https://myportal.com/server/rest/services/Hosted/myknowledgegraph/KnowledgeGraphServer\",\n",
    "    params={\"f\": \"json\", \"token\": token_response.json()[\"token\"]},\n",
    "    verify=False,\n",
    ")\n",
    "with open(os.path.join(output_folder, servicedef), \"w\") as f:\n",
    "    json.dump(sd_data.text, f)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ced3a37e",
   "metadata": {},
   "source": [
    "### Write data model entity types to backup json file\n",
    "Iterate through the data model of the knowledge graph to write all entity type objects to a backup json file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a52591c0",
   "metadata": {},
   "outputs": [],
   "source": [
    "# create list of formatted entity types\n",
    "entity_types = []\n",
    "for types in knowledgegraph_backup.datamodel[\"entity_types\"]:\n",
    "    curr_entity_type = {\n",
    "        \"name\": knowledgegraph_backup.datamodel[\"entity_types\"][types][\"name\"],\n",
    "        \"properties\": knowledgegraph_backup.datamodel[\"entity_types\"][types][\n",
    "            \"properties\"\n",
    "        ],\n",
    "    }\n",
    "    entity_types.append(curr_entity_type)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d73069ba",
   "metadata": {},
   "outputs": [],
   "source": [
    "# write entity types to json file\n",
    "with open(os.path.join(output_folder, dm_ent), \"w\") as f:\n",
    "    json.dump(entity_types, f)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dff06f42",
   "metadata": {},
   "source": [
    "### Write data model relationship types to backup json file\n",
    "Iterate through the data model of the knowledge graph to write all relationship type objects to a backup json file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "83c275c0",
   "metadata": {},
   "outputs": [],
   "source": [
    "# create list of formatted relationship types\n",
    "relationship_types = []\n",
    "for types in knowledgegraph_backup.datamodel[\"relationship_types\"]:\n",
    "    curr_relationship_type = {\n",
    "        \"name\": knowledgegraph_backup.datamodel[\"relationship_types\"][types][\"name\"],\n",
    "        \"properties\": knowledgegraph_backup.datamodel[\"relationship_types\"][types][\n",
    "            \"properties\"\n",
    "        ],\n",
    "    }\n",
    "    relationship_types.append(curr_relationship_type)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cee22fea",
   "metadata": {},
   "outputs": [],
   "source": [
    "# write relationship types to json file\n",
    "with open(os.path.join(output_folder, dm_rel), \"w\") as f:\n",
    "    json.dump(relationship_types, f)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dd6f9f53",
   "metadata": {},
   "source": [
    "### Write entities to backup json file\n",
    "Get all entities from the knowledge graph using a streaming query, clean them up for better writing when loading, and save the resulting entities to a json backup file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9d1bbe7f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# query for all entities in graph\n",
    "original_entities = knowledgegraph_backup.query_streaming(\"MATCH (n) RETURN distinct n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bc90a817",
   "metadata": {},
   "outputs": [],
   "source": [
    "# create list of formatted entities to add to the graph\n",
    "all_entities_fromquery = []\n",
    "for entity in list(original_entities):\n",
    "    curr_entity = entity[0]\n",
    "    # convert UUID values to a string since json can't store UUIDs\n",
    "    curr_entity[\"_id\"] = str(curr_entity[\"_id\"])\n",
    "    for prop in curr_entity[\"_properties\"]:\n",
    "        if type(curr_entity[\"_properties\"][prop]) == UUID:\n",
    "            curr_entity[\"_properties\"][prop] = str(curr_entity[\"_properties\"][prop])\n",
    "    # delete objectid, the server will handle creating new ones when we load the backup\n",
    "    del curr_entity[\"_properties\"][\"objectid\"]\n",
    "    all_entities_fromquery.append(curr_entity)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2760a47b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# write entities list to json file\n",
    "with open(os.path.join(output_folder, all_ent), \"w\") as f:\n",
    "    json.dump(all_entities_fromquery, f)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53d50a65",
   "metadata": {},
   "source": [
    "### Write relationships to backup json file\n",
    "Get all relationships from the knowledge graph using a streaming query, clean them up for better writing when loading, and save the resulting relationships to a json backup file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0bf7c90b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# query for all relationships in graph\n",
    "original_relationships = knowledgegraph_backup.query_streaming(\n",
    "    \"MATCH ()-[rel]->() RETURN distinct rel\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b4835acc",
   "metadata": {},
   "outputs": [],
   "source": [
    "# create list of formatted entities to add to the graph\n",
    "all_relationships_fromquery = []\n",
    "for relationship in list(original_relationships):\n",
    "    curr_relationship = relationship[0]\n",
    "    # convert UUID values to a string since json can't store UUIDs\n",
    "    curr_relationship[\"_id\"] = str(curr_relationship[\"_id\"])\n",
    "    curr_relationship[\"_originEntityId\"] = str(curr_relationship[\"_originEntityId\"])\n",
    "    curr_relationship[\"_destinationEntityId\"] = str(\n",
    "        curr_relationship[\"_destinationEntityId\"]\n",
    "    )\n",
    "    for prop in curr_relationship[\"_properties\"]:\n",
    "        if type(curr_relationship[\"_properties\"][prop]) == UUID:\n",
    "            curr_relationship[\"_properties\"][prop] = str(\n",
    "                curr_relationship[\"_properties\"][prop]\n",
    "            )\n",
    "    # delete objectid, the server will handle creating new ones when we load the backup\n",
    "    del curr_relationship[\"_properties\"][\"objectid\"]\n",
    "    all_relationships_fromquery.append(curr_relationship)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ffb27ca4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# write relationships list to json file\n",
    "with open(os.path.join(output_folder, all_rel), \"w\") as f:\n",
    "    json.dump(all_relationships_fromquery, f)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be75a27f",
   "metadata": {},
   "source": [
    "## OPTIONAL: Write provenance records to backup json file\n",
    "If you have provenance records that you want to maintain in the backups, this will get all provenance records and save them to a json backup file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a2769e71",
   "metadata": {},
   "outputs": [],
   "source": [
    "# write provenance information to the provenance data model file\n",
    "prov_structure = knowledgegraph_backup.datamodel[\"meta_entity_types\"][\"Provenance\"]\n",
    "with open(os.path.join(output_folder, dm_prov), \"w\") as f:\n",
    "    json.dump(prov_structure, f)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3e8576d4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# query for all provenance records in the graph\n",
    "provenance_entities = knowledgegraph_backup.query_streaming(\n",
    "    \"MATCH (n:Provenance) RETURN distinct n\", include_provenance=True\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "402ca4ce",
   "metadata": {},
   "outputs": [],
   "source": [
    "# create list of formatted provenance records to the graph\n",
    "all_provenance_fromquery = []\n",
    "for entity in list(provenance_entities):\n",
    "    curr_provenance = entity[0]\n",
    "    # convert UUID values to a string since json can't store UUIDs\n",
    "    curr_provenance[\"_id\"] = str(curr_provenance[\"_id\"])\n",
    "    for prop in curr_provenance[\"_properties\"]:\n",
    "        if type(curr_provenance[\"_properties\"][prop]) == UUID:\n",
    "            curr_provenance[\"_properties\"][prop] = str(\n",
    "                curr_provenance[\"_properties\"][prop]\n",
    "            )\n",
    "    # delete objectid, the server will handle creating new ones when we load the backup\n",
    "    del curr_provenance[\"_properties\"][\"objectid\"]\n",
    "    all_provenance_fromquery.append(curr_provenance)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e00efce3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# write provenance list to json file\n",
    "with open(os.path.join(output_folder, prov_file), \"w\") as f:\n",
    "    json.dump(all_provenance_fromquery, f)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c21c8066",
   "metadata": {},
   "source": [
    "## Load backup files"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a74ddfe3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# load data model json files into graph data model\n",
    "with open(os.path.join(output_folder, dm_ent), \"r\") as file:\n",
    "    dm_ents = json.load(file)\n",
    "with open(os.path.join(output_folder, dm_rel), \"r\") as file:\n",
    "    dm_rels = json.load(file)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "85163b0a",
   "metadata": {},
   "source": [
    "### Get original documents names to correctly load data\n",
    "This section will get the names of the document entity and relationship types from the original graph data, using the above commands the document types get their default names of 'Document' and 'HasDocument'. Just in case they have been named differently in the original knowledge graph, this steps allows us to match them up in later loading steps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1dd1cce2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# get document entity type name\n",
    "doc_type_name = \"Document\"\n",
    "for entity_type in dm_ents:\n",
    "    for prop in entity_type[\"properties\"]:\n",
    "        if entity_type[\"properties\"][prop][\"role\"] == \"esriGraphNamedObjectDocument\":\n",
    "            doc_type_name = entity_type[\"name\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0c3eaba3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# get document relationship type name\n",
    "doc_rel_type_name = \"HasDocument\"\n",
    "for relationship_type in dm_rels:\n",
    "    for prop in relationship_type[\"properties\"]:\n",
    "        if (\n",
    "            relationship_type[\"properties\"][prop][\"role\"]\n",
    "            == \"esriGraphNamedObjectDocument\"\n",
    "        ):\n",
    "            doc_rel_type_name = relationship_type[\"name\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "db4a59d8",
   "metadata": {},
   "source": [
    "### Connect to portal and create knowledge graph\n",
    "Connect to the portal and create a new knowledge graph service to load data model and data into"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dd042d14",
   "metadata": {},
   "outputs": [],
   "source": [
    "# connect to portal via GIS\n",
    "gis_load = GIS(\"home\")\n",
    "\n",
    "with open(os.path.join(output_folder, servicedef), \"r\") as file:\n",
    "    service_def = json.load(file)\n",
    "service_def = json.loads(service_def)\n",
    "updated_service_def = {\n",
    "    \"name\": \"myNewKnowledgeGraph\",  # replace with the name for your new knowledge graph\n",
    "    \"capabilities\": service_def[\"capabilities\"],\n",
    "    \"jsonProperties\": {\n",
    "        \"allowGeometryUpdates\": service_def[\"allowGeometryUpdates\"],\n",
    "        \"searchMaxRecordCount\": service_def[\"searchMaxRecordCount\"],\n",
    "        \"spatialReference\": service_def[\"spatialReference\"],\n",
    "        \"maxRecordCount\": service_def[\"maxRecordCount\"],\n",
    "        \"description\": service_def[\"description\"],\n",
    "        \"copyrightText\": service_def[\"copyrightText\"],\n",
    "        \"documentEntityTypeInfo\": {\n",
    "            \"documentEntityTypeName\": doc_type_name,\n",
    "            \"hasDocumentsRelationshipTypeName\": doc_rel_type_name,\n",
    "        },\n",
    "        \"supportsDocuments\": service_def[\"supportsDocuments\"],\n",
    "        \"supportsSearch\": service_def[\"supportsSearch\"],\n",
    "        \"supportsProvenance\": service_def[\"supportsProvenance\"],\n",
    "    },\n",
    "}\n",
    "\n",
    "result = gis_load.content.create_service(\n",
    "    name=\"\", service_type=\"KnowledgeGraph\", create_params=updated_service_def\n",
    ")\n",
    "\n",
    "knowledgegraph_load = KnowledgeGraph(result.url, gis=gis_load)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "70b11b8a",
   "metadata": {},
   "source": [
    "### Populate data model from saved json files\n",
    "Populate entity and relationship types from saved json files. This is using the variables defined in the 'Setup' section above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad6f6d63",
   "metadata": {},
   "outputs": [],
   "source": [
    "# populate entity and relationship types based on data model (errors about Document/HasDocument in the output are expected)\n",
    "knowledgegraph_load.named_object_type_adds(\n",
    "    entity_types=dm_ents, relationship_types=dm_rels\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fd1eecc7",
   "metadata": {},
   "source": [
    "### Add additional document entity and relationship type properties\n",
    "In the case additional properties exist in the original graph on document entity or relationship types, since the type was created at the time of knowledge graph creation we need to additional properties using graph_property_adds."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4c14b97c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# load any additional document entity type properties (errors about properties that already exist are expected)\n",
    "origin_document_properties = None\n",
    "for entity_type in dm_ents:\n",
    "    if entity_type[\"name\"] == doc_type_name:\n",
    "        origin_document_properties = entity_type[\"properties\"]\n",
    "prop_list = []\n",
    "for prop in origin_document_properties:\n",
    "    prop_list.append(origin_document_properties[prop])\n",
    "knowledgegraph_load.graph_property_adds(\n",
    "    type_name=doc_type_name, graph_properties=prop_list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4aead427",
   "metadata": {},
   "outputs": [],
   "source": [
    "# load any additional document relationship type properties (errors about properties that already exist are expected)\n",
    "for relationship_type in dm_rels:\n",
    "    if relationship_type[\"name\"] == doc_rel_type_name:\n",
    "        origin_document_rel_properties = relationship_type[\"properties\"]\n",
    "prop_list = []\n",
    "for prop in origin_document_rel_properties:\n",
    "    prop_list.append(origin_document_rel_properties[prop])\n",
    "knowledgegraph_load.graph_property_adds(\n",
    "    type_name=doc_rel_type_name, graph_properties=prop_list\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "31027810",
   "metadata": {},
   "source": [
    "### Get list of date properties from the data model json files\n",
    "From the data model type json files, find and make a list of date files so we can correctly create datetime objects when loading the data into the knowledge graph."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7cecd788",
   "metadata": {},
   "outputs": [],
   "source": [
    "date_properties = []\n",
    "# add date property names for entity types\n",
    "for types in dm_ents:\n",
    "    for prop in types[\"properties\"]:\n",
    "        if types[\"properties\"][prop][\"fieldType\"] == \"esriFieldTypeDate\":\n",
    "            date_properties.append(prop)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4d13d12b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# add date property names for relationship types\n",
    "for types in dm_rels:\n",
    "    for prop in types[\"properties\"]:\n",
    "        if types[\"properties\"][prop][\"fieldType\"] == \"esriFieldTypeDate\":\n",
    "            date_properties.append(prop)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "33eca61c",
   "metadata": {},
   "source": [
    "### Add all entities to the knowledge graph\n",
    "Add all entities from json file to the knowledge graph, formatting UUIDs, dates, and documents before loading the values. This loads the data in batches of 20,000 entities."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d959142e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# load entities json file\n",
    "with open(os.path.join(output_folder, all_ent), \"r\") as file:\n",
    "    original_entities = json.load(file)\n",
    "batch = []\n",
    "for curr_entity in original_entities:\n",
    "    # if a batch reaches 20k records, apply that batch of edits to the knowledge graph\n",
    "    if len(batch) > 20000:\n",
    "        result = knowledgegraph_load.apply_edits(adds=batch)\n",
    "        batch = []\n",
    "        # print error if one occurs during edit operation\n",
    "        try:\n",
    "            print(result[\"error\"])\n",
    "        except:\n",
    "            print(\"No error adding entities\")\n",
    "    # in case original document type name is different, change name to Document\n",
    "    if curr_entity[\"_typeName\"] == doc_type_name:\n",
    "        curr_entity[\"_typeName\"] = \"Document\"\n",
    "    # format UUID and date properties\n",
    "    for prop in curr_entity[\"_properties\"]:\n",
    "        if prop in date_properties:\n",
    "            try:\n",
    "                curr_entity[\"_properties\"][prop] = datetime.fromtimestamp(\n",
    "                    int(curr_entity[\"_properties\"][prop] / 1000)\n",
    "                )\n",
    "            except:\n",
    "                curr_entity[\"_properties\"][prop] = None\n",
    "        try:\n",
    "            curr_entity[\"_properties\"][prop] = UUID(curr_entity[\"_properties\"][prop])\n",
    "        except:\n",
    "            continue\n",
    "    # format id UUID\n",
    "    curr_entity[\"_id\"] = UUID(curr_entity[\"_id\"])\n",
    "    batch.append(curr_entity)\n",
    "# apply final batch of edits to the knowledge graph\n",
    "result = knowledgegraph_load.apply_edits(adds=batch)\n",
    "# print error if one occurs during edit operation\n",
    "try:\n",
    "    print(result[\"error\"])\n",
    "except:\n",
    "    print(\"No error adding entities\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "034ed628",
   "metadata": {},
   "source": [
    "### Add all relationships to the knowledge graph\n",
    "Add all relationships from json file to the knowledge graph, formatting UUIDs, dates, and documents before loading the values. This loads the data in batches of 20,000 relationships."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "573343e9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# load relationships json file\n",
    "with open(os.path.join(output_folder, all_rel), \"r\") as file:\n",
    "    original_rels = json.load(file)\n",
    "batch = []\n",
    "for curr_relationship in original_rels:\n",
    "    # if a batch reaches 20k records, apply that batch of edits to the knowledge graph\n",
    "    if len(batch) > 20000:\n",
    "        result = knowledgegraph_load.apply_edits(adds=batch)\n",
    "        batch = []\n",
    "        # print error if one occurs during edit operation\n",
    "        try:\n",
    "            print(result[\"error\"])\n",
    "        except:\n",
    "            print(\"No error adding relationships\")\n",
    "    # in case original document type name is different, change name to HasDocument\n",
    "    if curr_relationship[\"_typeName\"] == doc_rel_type_name:\n",
    "        curr_relationship[\"_typeName\"] = \"HasDocument\"\n",
    "    # format UUID and date properties\n",
    "    for prop in curr_relationship[\"_properties\"]:\n",
    "        if prop in date_properties:\n",
    "            try:\n",
    "                curr_relationship[\"_properties\"][prop] = datetime.fromtimestamp(\n",
    "                    int(curr_relationship[\"_properties\"][prop] / 1000)\n",
    "                )\n",
    "            except:\n",
    "                curr_relationship[\"_properties\"][prop] = None\n",
    "        try:\n",
    "            curr_relationship[\"_properties\"][prop] = UUID(\n",
    "                curr_relationship[\"_properties\"][prop]\n",
    "            )\n",
    "        except:\n",
    "            continue\n",
    "    # format other relationship specific UUIDs\n",
    "    curr_relationship[\"_id\"] = UUID(curr_relationship[\"_id\"])\n",
    "    curr_relationship[\"_originEntityId\"] = UUID(curr_relationship[\"_originEntityId\"])\n",
    "    curr_relationship[\"_destinationEntityId\"] = UUID(\n",
    "        curr_relationship[\"_destinationEntityId\"]\n",
    "    )\n",
    "    batch.append(curr_relationship)\n",
    "# apply final batch of edits to the knowledge graph\n",
    "result = knowledgegraph_load.apply_edits(adds=batch)\n",
    "# print error if one occurs during edit operation\n",
    "try:\n",
    "    print(result[\"error\"])\n",
    "except:\n",
    "    print(\"No error adding relationships\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7d67f4ed",
   "metadata": {},
   "source": [
    "### Add search indexes to all text properties\n",
    "Adding search indexes to all text properties will allow for the values in those properties to be searched for when using search in any client."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1e542d1c",
   "metadata": {},
   "outputs": [],
   "source": [
    "load_dm = knowledgegraph_load.datamodel\n",
    "# add search indexes for all entity text properties\n",
    "for entity_type in load_dm[\"entity_types\"]:\n",
    "    prop_list = []\n",
    "    for prop in load_dm[\"entity_types\"][entity_type][\"properties\"]:\n",
    "        if (\n",
    "            load_dm[\"entity_types\"][entity_type][\"properties\"][prop][\"fieldType\"]\n",
    "            == \"esriFieldTypeString\"\n",
    "        ):\n",
    "            prop_list.append(prop)\n",
    "    knowledgegraph_load.update_search_index(\n",
    "        adds={entity_type: {\"property_names\": prop_list}}\n",
    "    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "29234c25",
   "metadata": {},
   "outputs": [],
   "source": [
    "# add search indexes for all relationship text properties\n",
    "for relationship_type in load_dm[\"relationship_types\"]:\n",
    "    prop_list = []\n",
    "    for prop in load_dm[\"relationship_types\"][relationship_type][\"properties\"]:\n",
    "        if (\n",
    "            load_dm[\"relationship_types\"][relationship_type][\"properties\"][prop][\n",
    "                \"fieldType\"\n",
    "            ]\n",
    "            == \"esriFieldTypeString\"\n",
    "        ):\n",
    "            prop_list.append(prop)\n",
    "    knowledgegraph_load.update_search_index(\n",
    "        adds={relationship_type: {\"property_names\": prop_list}}\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0694de09",
   "metadata": {},
   "source": [
    "## OPTIONAL: Add provenance records to the knowledge graph\n",
    "This will only apply if you created a backup of provenance records and have enabled provenance on your knowledge graph service."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa2099c5",
   "metadata": {},
   "source": [
    "### Add additional provenance entity type properties\n",
    "In the case additional properties exist in the original graph on the provenance type, we need to add those properties to the data model using graph_property_adds."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c2fddb86",
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(os.path.join(output_folder, dm_prov), \"r\") as file:\n",
    "    prov_dm = json.load(file)\n",
    "\n",
    "prop_list = []\n",
    "for prop in prov_dm[\"properties\"]:\n",
    "    prop_list.append(prov_dm[\"properties\"][prop])\n",
    "knowledgegraph_load.graph_property_adds(\n",
    "    type_name=\"Provenance\", graph_properties=prop_list\n",
    ")\n",
    "# errors about already existing properties are expected"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "13b25b06",
   "metadata": {},
   "outputs": [],
   "source": [
    "# load provenance records json file\n",
    "with open(os.path.join(output_folder, prov_file), \"r\") as file:\n",
    "    prov_entities = json.load(file)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "973c889f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# add all provenance records\n",
    "for curr_prov in prov_entities:\n",
    "    # format UUID properties\n",
    "    for prop in curr_prov[\"_properties\"]:\n",
    "        try:\n",
    "            curr_prov[\"_properties\"][prop] = UUID(curr_prov[\"_properties\"][prop])\n",
    "        except:\n",
    "            continue\n",
    "    # format id as UUID\n",
    "    curr_prov[\"_id\"] = UUID(curr_prov[\"_id\"])\n",
    "    # add provenance record\n",
    "    knowledgegraph_load.apply_edits(adds=[curr_prov])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
